March, 2020

Testing Causal Claims

Last time we met…

We highlighted the difference between deterministic and probabilistic causal claims.

Probabilistic causal claims are usually answers to questions about the effects of causes.

The events surrounding the ongoing COVID-19 pandemic have produced many questions about effects of causes.

[Stop here and read the New York Times article required for this module]

You should be reading!

Public Health Emergency

As in the US and elsewhere, the provincial and municipal governments in British Columbia have declared states of emergency and issued many public health orders:

  • Schools closed indefinitely
  • Large gatherings banned
  • Public facilities closed
  • Self-quarantine rules
  • Closing beaches and parks
  • Removing logs from public beaches
  • And more…

Public Health Emergency

What are the effects of these public health emergency measures?

Will they achieve their desired effect and “flatten the curve”?

  • In these slides, we will outline how we begin to assess effects of causes.

Outline

  1. Causal claims must imply empirical hypotheses that we can test

  2. But even if variables are observable, we can’t observe counterfactual states of the world: Fundamental Problem of Causal Inference
  3. One solution is to look at the observed pattern (or correlation) between the independent and dependent variables
  4. Correlation can suffer from different problems
    • Random error: need to assess whether the pattern appeared by chance
    • Bias: the observed correlation is not the true causal relationship

Causal Hypotheses

Variables and Causal Claims:

Just like with descriptive claims, we need to convert concepts in causal claims into variables.

Independent variable:

The variable capturing the alleged cause in a causal claim.

  • often called the “IV” or “X” or “right-hand variable”

Dependent variable:

The variable capturing the alleged outcome (what is affected) in a causal claim.

  • often denoted as “DV” or “Y” or “left-hand variable”

Examples: (question)

Do emergency public health orders “flatten the curve”?

Independent Variable: ?

Dependent Variable: ?

Examples: (answer)

Do emergency public health orders to social distance “flatten the curve”?

Independent Variable (X): E.g., the number of mandatory orders to impose social distancing; or, mandatory closure of public schools (yes or no)

Dependent Variable (Y): Highest weekly mortality rate per 100k over the course of an epidemic

Hypotheses

We make causal claims testable by generating…

hypotheses (or empirical predictions):

these are statements about what we should observe if the causal claim is true.

Hypotheses state the expected relationship between independent and dependent variables implied by the causal claim. For example:

  • If \(X\) were present(absent), then \(Y\) would be present(absent)
  • If \(X\) were present(absent), then \(Y\) would be more(less) likely
  • If \(X\) were to increase(decrease), then \(Y\) would increase(decrease)

Hypotheses

Causal hypotheses are, in principle, testable because they are stated in terms of variables.

But, remember: causality is counterfactual.

So, causal hypotheses are statements about what values the dependent variable would take if the independent variable were at different values.

In other words: causal hypotheses are statements about what the potential outcomes are, if the causal claim is true.

Example:

claim: “Government-mandated social distancing ‘flattens the curve’.”

independent variable: number of government public health mandates issued in a place

dependent variable: Highest weekly mortality rate per 100k over the course of an epidemic

What is the empirical prediction/hypothesis?

Example

claim: “Government-mandated social distancing ‘flattens the curve’.”

independent variable (X): number of government public health mandates issued in a place

dependent variable (Y): Highest weekly mortality rate per 100k over the course of an epidemic (peak mortality rate)

hypothesis: If the government had issued more orders to social distance (X) there would have been lower peak mortality rates (Y).

What is the problem here?

Consider this hypothesis when applied to our current situation:

If the government issued more orders to social distance, there would be lower peak mortality rates

This implies the following relationship between potential outcomes:

\[\begin{equation} \begin{split} \mathrm{Peak \ Mortality \ Rate}_{BC}(\mathrm{More \ Social \ Distance}) < \\ \mathrm{Peak \ Mortality \ Rate}_{BC}(\mathrm{Same \ Social \ Distance}) \end{split} \end{equation}\]

Let’s assume that the government is likely to impose stricter rules and will probably issue a “shelter-in-place” order. If that happens, which of the above potential outcomes will we not observe?

What is the problem here?

If, as is likely, the provincial government continues to impose stronger social distancing rules, then the result is:

\[\begin{equation} \begin{split} \mathrm{Peak \ Mortality \ Rate}_{BC}(\mathrm{More \ Social \ Distance}) < \\ \color{red}{\mathrm{Peak \ Mortality \ Rate}_{BC}(\mathrm{Same \ Social \ Distance})} \end{split} \end{equation}\]

The potential outcome in \(\color{red}{\mathrm{red}}\) (the peak mortality rate in BC that would occur if we kept the status quo and did not impose more social distancing measures) would remain counterfactual.

We are unable to observe British Columbia in the alternate universe where no further social distancing orders are made. As a result, we cannot directly empirically test the claim that imposing these measures causes a flattening of the curve.

What is the problem here?

This is not unique to public health measures during a pandemic. It is the:

Fundamental Problem of Causal Inference

For any case, we can only observe the potential outcome of Y for the value of X that the case is actually exposed to. We can never observe the other, counterfactual potential outcomes of Y for different possible values of X that the case did not experience.

Because of the counterfactual definition of causality: for any specific case, we can never empirically observe whether X causes Y.
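To make the Fundamental Problem concrete, here is a minimal sketch in Python. The `Case` class and all of the numbers are hypothetical and made up purely for illustration. Inside a simulation we can write down both potential outcomes, but the `observed` method only ever reveals the one that matches the value of X the case actually experienced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """One case with both potential outcomes written down (only possible in a simulation)."""
    name: str
    actual_x: str      # value of X the case really experienced: "less" or "more"
    y_if_less: float   # potential outcome of Y under less social distancing (made-up number)
    y_if_more: float   # potential outcome of Y under more social distancing (made-up number)

    def observed(self, x: str) -> Optional[float]:
        """Return Y for the requested value of X, but only if that value is the factual one."""
        if x != self.actual_x:
            return None  # counterfactual potential outcome: never observable
        return self.y_if_more if x == "more" else self.y_if_less


# Hypothetical case, invented for illustration only
case = Case(name="Hypothetical city", actual_x="more", y_if_less=12.0, y_if_more=5.0)

true_effect = case.y_if_more - case.y_if_less  # knowable only because we invented both numbers
seen_more = case.observed("more")              # factual: we can see this
seen_less = case.observed("less")              # counterfactual: we cannot see this

print("Observed under more SD:", seen_more)
print("Observed under less SD:", seen_less)
```

In real data we only ever have what `observed` returns, so `true_effect` cannot be computed for any single case.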

Flattening the Curve

If we can’t observe the counterfactual world in which fewer emergency public health orders were issued to keep us socially distant, how do we know that charts like this are correct?

Flattening the Curve

If we cannot observe what would happen in the counterfactual world where fewer public health measures were taken, how can we respond to the doubters in the New York Times article or people who are not sure what to think about “flattening the curve”?

Flattening the Curve

Even though the Fundamental Problem of Causal Inference limits our ability to state definitively the causal consequences of public health emergency actions in British Columbia, we can still present plausible evidence as to their effects.

But, if we don’t have access to an alternate universe and thus can’t observe the counterfactual British Columbia that has fewer public health measures:

What can we do?


An Example

The 1918-1919 Spanish Flu Epidemic

The 1918-1919 Spanish Flu Epidemic

Cities across the United States were exposed to the influenza epidemic at different times and reacted differently.

Philadelphia, PA: The first case of “Spanish flu” was reported on 17 September 1918. Rather than reacting quickly, city officials “downplayed their significance and allowed large public gatherings, notably a city-wide parade on September 28, 1918” to raise war bonds. “School closures, bans on public gatherings, and other social distancing interventions were not implemented until October 3, when disease spread had already begun to overwhelm local medical and public health resources.” - Hatchett et al. 2007

St. Louis, MO: The first case of “Spanish flu” was reported on October 5, and “authorities moved rapidly to introduce a broad series of measures designed to promote social distancing, implementing these on October 7.” - Hatchett et al. 2007

Compared to Philadelphia, which took more than 2 weeks to take action, St. Louis imposed strict social distancing measures only 2 days after the first cases appeared.

The 1918-1919 Spanish Flu Epidemic

Based on this plot, we can see that the “curve” was flatter in St. Louis than it was in Philadelphia.

What is observable here is that:

A city which rapidly implemented social distancing measures (St. Louis) had a relatively lower peak mortality rate (flatter curve) than another city (Philadelphia) which delayed in implementing social distancing measures.

We might be tempted to think: does this mean forced social distancing causes a flatter curve (lower peak mortality rates)?

Correlation

Correlation

In this example of St. Louis and Philadelphia, we have been looking at the correlation between public health interventions and influenza mortality rates.

Correlation

Correlation is the degree of association/relationship between the observed values of \(X\) (the independent variable) and \(Y\) (the dependent variable)

  • In statistics courses, there are very specific mathematical definitions of correlation. Here, we will use the term as an umbrella category for many different techniques that examine whether there is a pattern in the relationship between independent and dependent variables.

Correlation

All empirical evidence for causal claims relies on correlation between the independent and dependent variables. This is because we never have direct access to counterfactuals.

But, you’ve probably all heard this:

Correlation: Assumption

Whether or not we can treat correlation as evidence of causation depends on whether we believe one big assumption: (more on this next lecture)

key assumption for correlation to show causation:

The cases we observe with different values of \(X\) (independent variable) all have, either exactly or on average, the same potential outcomes of \(Y\) (the dependent variable). In other words: the cases we actually observe (the factual cases) with different values of \(X\) are, in essence, like counterfactuals for each other.

Correlation: Assumption

To make sense of this assumption, let’s consider what correlation means in the context of our comparison of Philadelphia and St. Louis.

We compare the two factual/observed potential outcomes: \(\mathrm{Peak \ Mortality \ Rate}_{St. Louis}(\mathrm{More \ Social \ Distance})\) versus \(\mathrm{Peak \ Mortality \ Rate}_{Philadelphia}(\mathrm{Less \ Social \ Distance})\). (We’ll shorten Mortality Rate to MR, Social Distancing to SD)

Correlation: Assumption

But the other potential outcomes (marked in \(\color{red}{\mathrm{red}}\) ) are unknown because they are counterfactual. We don’t know whether more social distancing causes lower peak mortality rates, because we don’t observe these cities in the alternative universe where they adopted a different social distancing policy.

| City | Less SD | More SD |
|---|---|---|
| Philadelphia | \(\mathrm{Peak \ MR_{Phil.}(Less \ SD)}\) | \(\color{red}{\mathrm{Peak \ MR_{Phil.}(More \ SD)}}\) |
| St. Louis | \(\color{red}{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}\) | \(\mathrm{Peak \ MR_{St. L.}(More \ SD)}\) |

Correlation: Assumptions

For the correlation that we observe between more public health measures and lower peak mortality to imply causality, we need to assume that:

\(1\). In the counterfactual world where Philadelphia had employed more social distancing, Philadelphia’s peak influenza mortality (\(\color{red}{\mathrm{Peak \ MR_{Phil.}(More \ SD)}}\)) would have been the same as the peak influenza mortality in St. Louis, which in reality did employ more social distancing (\(\mathrm{Peak \ MR_{St. L.}(More \ SD)}\)).

So we can let the mortality rate in a city with more social distancing we can observe (\(\mathrm{Peak \ MR_{St. L.}(More \ SD)}\)) stand in for the counterfactual mortality rate we can’t observe (\(\color{red}{\mathrm{Peak \ MR_{Phil.}(More \ SD)}}\))

Correlation: Assumptions

| City | Less SD | More SD |
|---|---|---|
| Philadelphia | \(\mathrm{Peak \ MR_{Phil.}(Less \ SD)}\) | \(\color{red}{\boxed{\mathrm{Peak \ MR_{Phil.}(More \ SD)}}}\) |
| | | \(\Uparrow\) |
| St. Louis | \(\color{red}{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |

Correlation: Assumptions

| City | Less SD | More SD |
|---|---|---|
| Philadelphia | \(\mathrm{Peak \ MR_{Phil.}(Less \ SD)}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |
| | | \(\Uparrow\) |
| St. Louis | \(\color{red}{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |

Correlation: Assumptions

For the correlation that we observe between more public health measures and lower peak mortality to imply causality, we also need to assume that:

\(2\). In the counterfactual world where St. Louis had employed less social distancing, St. Louis’s peak influenza mortality (\(\color{red}{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}\)) would have been the same as the peak influenza mortality in Philadelphia, which in reality did employ less social distancing (\(\mathrm{Peak \ MR_{Phil.}(Less \ SD)}\)).

So we can let the mortality rate we can observe (\(\mathrm{Peak \ MR_{Phil.}(Less \ SD)}\)) stand in for the counterfactual mortality rate we can’t observe (\(\color{red}{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}\))

Correlation: Assumptions

| City | Less SD | More SD |
|---|---|---|
| Philadelphia | \(\boxed{\mathrm{Peak \ MR_{Phil.}(Less \ SD)}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |
| | \(\Downarrow\) | \(\Uparrow\) |
| St. Louis | \(\color{red}{\boxed{\mathrm{Peak \ MR_{St. L.}(Less \ SD)}}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |

Correlation: Assumptions

| City | Less SD | More SD |
|---|---|---|
| Philadelphia | \(\boxed{\mathrm{Peak \ MR_{Phil.}(Less \ SD)}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |
| | \(\Downarrow\) | \(\Uparrow\) |
| St. Louis | \(\boxed{\mathrm{Peak \ MR_{Phil.}(Less \ SD)}}\) | \(\boxed{\mathrm{Peak \ MR_{St. L.}(More \ SD)}}\) |

Correlation: Assumptions

By assuming that we can slot in the observed/factual peak mortality rates from one city for the counterfactual and thus unobserved peak mortality rates for the other city, we can make the jump from correlation to causation.

The goal of social science is to strongly interrogate that assumption to ensure that we have considered alternatives.
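A small simulation can illustrate what the assumption buys us. The sketch below is not based on any real data: the effect size (`TRUE_EFFECT`), the baseline mortality numbers, and the coin-flip assignment of social distancing are all assumptions chosen for illustration. Because each simulated city's policy is unrelated to its potential outcomes, cities with more social distancing can stand in for the counterfactuals of cities with less, and the observed difference comes out close to the true effect.

```python
import random
from statistics import mean

random.seed(0)

TRUE_EFFECT = -3.0  # assumed change in peak mortality caused by more SD (made-up number)


def simulate_city():
    """Return (more_sd, observed_peak_mortality) for one hypothetical city."""
    y_less = random.gauss(10.0, 2.0)   # potential outcome under less SD (made-up scale)
    y_more = y_less + TRUE_EFFECT      # potential outcome under more SD
    more_sd = random.random() < 0.5    # policy assigned independently of potential outcomes
    observed = y_more if more_sd else y_less
    return more_sd, observed


cities = [simulate_city() for _ in range(10_000)]

mean_more = mean(y for more_sd, y in cities if more_sd)
mean_less = mean(y for more_sd, y in cities if not more_sd)
estimate = mean_more - mean_less       # what the observed correlation tells us

print("Observed difference:", round(estimate, 2))
print("True causal effect: ", TRUE_EFFECT)
```

If the policy were instead chosen in a way that depends on the cities' potential outcomes, the observed difference would no longer match `TRUE_EFFECT`.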

More about Correlation

In many cases, the easiest way to investigate correlation is visually. Typically this is done through a scatterplot.

scatterplots represent cases as points in a two dimensional plane.

  • the \(X\) axis position of the point reflects the observed value the case has on the independent variable
  • the \(Y\) axis position of the point reflects the observed value the case has on the dependent variable

By looking at the point on the plot, you can read a case’s value on both independent and dependent variables.

More about Correlation

We can visualize the correlation between public health emergency measures and flattening the curve that we’ve been discussing on the following plot:

More about Correlation

More about Correlation

Read the point positions as the value: (X,Y). The red line traces this point back to the y-axis; the blue line traces the point to the x-axis.

More about Correlation

More about Correlation

As is visible in the preceding plot, it is possible to see patterns between independent and dependent variables using a scatterplot.

Correlations can be described in three ways:

  • correlations have a direction:
    • positive: implies that as \(X\) increases, \(Y\) increases
    • negative: \(X\) increases, \(Y\) decreases

  • correlations have strength:
    • strong: \(X\) and \(Y\) almost always move together (relationship between \(X\) and \(Y\) is nearly deterministic)
    • weak: \(X\) and \(Y\) do not move together very much (relationship between \(X\) and \(Y\) is very probabilistic and “noisy”)

  • correlations have a magnitude:
    • a large magnitude means that as \(X\) changes, \(Y\) changes by a large amount
    • a small magnitude means that as \(X\) changes, \(Y\) changes by a small amount
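Direction, strength, and magnitude can be told apart with a few lines of Python. The sketch below uses invented data, and it makes two choices that are assumptions rather than anything defined in these slides: it measures strength with the Pearson correlation coefficient (`pearson_r`) and magnitude with the slope of a best-fit line (`slope`).

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient: sign gives direction, absolute size gives strength."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def slope(xs, ys):
    """Slope of the least-squares line: how much Y changes per unit change in X (magnitude)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


random.seed(1)
xs = [random.uniform(0, 10) for _ in range(2000)]
tight = [2 * x + random.gauss(0, 1) for x in xs]     # positive, strong
noisy = [2 * x + random.gauss(0, 10) for x in xs]    # positive, same magnitude, weaker
flipped = [-2 * x + random.gauss(0, 1) for x in xs]  # negative, same magnitude, strong

for label, ys in [("tight", tight), ("noisy", noisy), ("flipped", flipped)]:
    print(label, "r =", round(pearson_r(xs, ys), 2), "slope =", round(slope(xs, ys), 2))
```

The `tight` and `noisy` series share a direction and a magnitude but differ in strength, while `tight` and `flipped` share a magnitude and a strength but differ in direction.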

More about Correlation

  • same direction (positive)
  • same magnitude (slope of the line is the same)
  • different strength: plot 1 has stronger correlation than plot 2.

More about Correlation

  • different direction: plot 1 is positive, plot 2 is negative
  • same magnitude (slopes are same but in opposite directions)
  • same strength

Correlation Errors

Correlation Errors

Just like with measurement and sampling, correlation can suffer from two types of errors:

  • They are errors in the sense that \(\mathrm{Causation(X,Y)_{True} - Correlation(X,Y)_{Observed}} \neq 0\): The true causal relationship between X and Y is different from the observed correlation between X and Y.

Correlation Errors

Like with measurement errors and sampling errors, there are two kinds of correlation errors:

  1. random errors: we observe patterns between X and Y that differ from the true causal relationship by chance
  2. bias: there is a systematic process that means we consistently observe a correlation between X and Y that is different from the true causal effect of X on Y.

As with measurement and sampling error, it is easier to address problems of random errors than it is to address bias.

Correlation: Random Errors

How do we address the problem of random errors in correlation?

We use statistics to calculate how likely the pattern between X and Y is to occur by chance.

The field of statistics investigates properties of chance events (stochastic processes):

  • Probability theory tells us how likely events are to happen, given chance
  • Can tell us how likely correlation of some value is to happen by chance

Correlation: Random Errors

How do we know how likely a correlation is to occur by chance?

  1. Compute correlation of \(X\) and \(Y\)
    • Stronger correlations less likely to occur at random
  2. How many cases do we have?
    • Patterns with many cases less likely to occur at random
  3. Assign a probability that the correlation we see would have happened by chance


Correlation: Random Errors

On the following slides, I use a computer to randomly generate values of X and Y using unrelated processes. Any correlation between X and Y is thus entirely by chance.

I repeat this process until I find a very strong correlation between X and Y. Then I produce a scatterplot of the strong correlation produced at random and the number of random tries it took to get the strong correlation (by chance).
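The procedure described above can be sketched as follows. This is not the code used to produce the slides. The threshold of 0.8 for a "very strong" correlation, the case counts, and the function name `tries_until_strong` are all assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def tries_until_strong(n_cases, threshold, rng, max_tries):
    """Draw unrelated X and Y repeatedly; return the try on which |r| first reaches threshold."""
    for attempt in range(1, max_tries + 1):
        xs = [rng.gauss(0, 1) for _ in range(n_cases)]
        ys = [rng.gauss(0, 1) for _ in range(n_cases)]  # generated independently of xs
        if abs(pearson_r(xs, ys)) >= threshold:
            return attempt
    return None  # no strong correlation appeared within max_tries


rng = random.Random(42)
few_cases = tries_until_strong(n_cases=5, threshold=0.8, rng=rng, max_tries=100_000)
many_cases = tries_until_strong(n_cases=50, threshold=0.8, rng=rng, max_tries=2_000)

print("Tries needed with 5 cases: ", few_cases)
print("Tries needed with 50 cases:", many_cases)
```

With only 5 cases a strong correlation shows up by chance fairly quickly. With 50 cases the search is expected to give up and return `None`, which previews the point that patterns with many cases are less likely to occur at random.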

Correlation: Random Errors

Same Correlation, More cases

Correlation: Random Errors

statistical significance:

An indication of how likely it is that the correlation we observe could have happened purely by chance.

A higher degree of statistical significance indicates that the correlation is less likely to have happened by chance.

Correlation: Random Errors

\(p\) value:

  • A numerical measure of statistical significance. It puts a number on how likely it is that a correlation at least as strong as the one we observed would occur by chance, assuming we know the chance procedure and the truth is no correlation between X and Y.

  • It is a probability, so is between \(0\) and \(1\).

  • Lower \(p\)-values indicate greater statistical significance. (Lower \(p\) values for stronger correlations, more cases)
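One way to see what a \(p\)-value measures is to compute one by brute force. The sketch below uses a permutation test, which is a swapped-in technique chosen for illustration and not necessarily how \(p\)-values are computed elsewhere in this course. The data are invented, and `permutation_p_value` is a hypothetical helper. Shuffling Y breaks any real link to X, so the shuffled datasets show what chance alone produces.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles, rng):
    """Share of shuffled datasets whose |r| is at least as large as the observed |r|."""
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # pair each X with a randomly chosen Y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)  # +1 keeps the estimate above zero


rng = random.Random(7)
xs = [float(i) for i in range(30)]
ys_related = [-0.5 * x + rng.gauss(0, 3) for x in xs]  # Y really depends on X (made-up data)
ys_unrelated = [rng.gauss(0, 3) for _ in xs]           # Y generated without reference to X

p_related = permutation_p_value(xs, ys_related, n_shuffles=2000, rng=rng)
p_unrelated = permutation_p_value(xs, ys_unrelated, n_shuffles=2000, rng=rng)

print("p-value when X and Y are related:  ", round(p_related, 4))
print("p-value when X and Y are unrelated:", round(p_unrelated, 4))
```

The related series should produce a very small \(p\)-value. The unrelated series can produce almost any value between 0 and 1, including, occasionally, a small one.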


Correlation: Random Errors

\(p < 0.05\) often used as threshold for “significant” result.

  • but it is not a magic number
  • Even when there is truly no relationship, we will observe \(p < 0.05\) by chance \(\frac{1}{20}\)th of the time

Correlation: Random Errors

\(p\) value:

Be wary of “\(p\)-hacking”

  • \(p\) values become meaningless if we look at many associations, then only report the ones that are “significant”.

Why?

  • low \(p\)-values still occur by chance
  • when we look at lots of correlations, we expect to see some low \(p\) values by chance.
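The danger of looking at many associations can be checked with a simulation. In this sketch every dataset is pure noise, so any "significant" result is a false alarm. The dataset size of 20 cases and the number of tests are assumptions chosen for illustration. The cutoff uses the standard two-sided 5% critical value of Student's t with 18 degrees of freedom (2.101), converted to a critical correlation.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


N_CASES = 20
T_CRITICAL = 2.101  # two-sided 5% critical value of Student's t, 18 degrees of freedom
# A correlation larger than this (in absolute value) has p < 0.05 when N = 20
R_CRITICAL = T_CRITICAL / math.sqrt(T_CRITICAL ** 2 + (N_CASES - 2))

rng = random.Random(2020)
n_tests = 2000
significant = 0
for _ in range(n_tests):
    xs = [rng.gauss(0, 1) for _ in range(N_CASES)]
    ys = [rng.gauss(0, 1) for _ in range(N_CASES)]  # unrelated to xs by construction
    if abs(pearson_r(xs, ys)) > R_CRITICAL:
        significant += 1

share_significant = significant / n_tests
print("Correlations tested:      ", n_tests)
print("'Significant' by chance:  ", significant)
print("Share with p < 0.05:      ", round(share_significant, 3))
```

Roughly one test in twenty clears the bar even though nothing real is going on. Reporting only those results would make noise look like findings.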

Significant?

What else do you want to know?

We’d want to know this

Correlation: Random Errors

Recap:

  1. Correlations can appear by chance
  2. We can assess probability of chance correlation if we know:
    • strength of correlation
    • size of the sample (\(N\))
  3. \(p\)-values:
    • Obtained using mathematical formula
    • Are lower when correlations are stronger or number of cases is larger.
    • lower \(p\) values imply statistical significance (less likely to be by chance)

Correlation: Random Errors

So, we can address random errors in correlation by assessing the \(p\)-value of the correlation.

If the \(p\)-value is low, we can conclude (with caution) that the correlation is statistically significant and that the chance of random error driving our correlation is low.

To assess whether the relationship between public health interventions and flattening the curve of the Spanish Flu pandemic is due to chance, we need to collect more data (do some extra credit!)

Correlation: Random Errors

Recap:

| Statistical Significance | \(p\)-value | By Chance? | Why? | “Real”? |
|---|---|---|---|---|
| Low | High (\(p > 0.05\)) | Likely | small \(N\), weak correlation | Probably not |
| High | Low (\(p < 0.05\)) | Unlikely | large \(N\), strong correlation | Probably |

Correlation: Random Errors

When looking at public health interventions during the 1918-1919 Spanish Flu epidemic, Hatchett et al. find that American cities with more interventions had lower peak mortality, and that this relationship would occur by chance only about 0.2% of the time.

We can conclude that such a correlation is quite unlikely to be in error by chance.

In the next lecture we will consider whether this correlation might suffer from bias.

Correlation: Bias

As with measurement bias and sampling bias, biases of correlation are more pernicious and harder to solve than random errors.

Correlation: Bias

Consider an example:

Does wearing a surgical mask during everyday activities during cold and flu season reduce the risk of contracting a cold or flu?

Correlation: Bias

What if we observe this correlation:

people who wear masks have lower rates of cold and flu infection

Can we conclude that masks cause them to have lower rates of infection?

Correlation: Bias

There are good reasons to think not:

People who wear masks in everyday life might be more cautious about illness in other ways:

  • practice more social distancing
  • know and use better hand hygiene
  • clean surfaces in their home more frequently
  • avoid people who appear ill

This might be because they:

  • have a weakened immune system
  • have some sort of fear of illness or germs

Correlation: Bias

Thus even if masks provide no benefit, people who wear masks might, due to their other practices, already be less likely to contract a respiratory illness than those who do not wear masks.

Thus we would observe a correlation between mask-wearing and lower rates of infection. But it might not be the TRUE causal effect of masks. The observed correlation could suffer from bias.

We will discuss this problem (also called confounding) in more detail in the next lecture.
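The mask example can be simulated to show how bias arises. In the sketch below every probability is invented for illustration, and masks are given no effect at all: infection risk depends only on whether a person is cautious. Cautious people are simply more likely to wear masks.

```python
import random

rng = random.Random(3)


def simulate_person():
    """Return (cautious, wears_mask, infected) for one hypothetical person."""
    cautious = rng.random() < 0.5
    wears_mask = rng.random() < (0.8 if cautious else 0.2)  # cautious people mask more often
    infected = rng.random() < (0.10 if cautious else 0.30)  # risk ignores the mask entirely
    return cautious, wears_mask, infected


def infection_rate(people):
    """Share of the given people who were infected."""
    return sum(1 for _, _, infected in people if infected) / len(people)


population = [simulate_person() for _ in range(40_000)]

masked = [p for p in population if p[1]]
unmasked = [p for p in population if not p[1]]
rate_masked = infection_rate(masked)
rate_unmasked = infection_rate(unmasked)

# Compare like with like: only cautious people, with and without masks
cautious_masked = [p for p in masked if p[0]]
cautious_unmasked = [p for p in unmasked if p[0]]
gap_within_cautious = infection_rate(cautious_masked) - infection_rate(cautious_unmasked)

print("Infection rate, mask wearers:", round(rate_masked, 3))
print("Infection rate, non-wearers: ", round(rate_unmasked, 3))
print("Gap among cautious people:   ", round(gap_within_cautious, 3))
```

Mask wearers show a clearly lower infection rate even though the true causal effect of masks in this simulation is zero. Once we compare only cautious people with each other, the gap nearly disappears.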

Internal / External Validity

Internal / External Validity

Finally, when we use the correlation of X and Y to assess the causal effect of X on Y we must consider two aspects of validity:

internal validity: when a correlation has internal validity, it is unlikely that the correlation suffers from bias. It is likely the true causal relationship between X and Y.

external validity: when a correlation has external validity, it correctly describes the causal relationship between X and Y for all the cases in which we are interested.

Internal / External Validity

An Example

Consider these two studies in the context of the question: does wearing surgical masks during everyday activities reduce YOUR (you, specifically) risk of contracting a respiratory infection from a cold or flu virus?

Internal / External Validity

Study 1

Researchers placed surgical masks on dummy heads and exposed them to aerosolized flu viruses.

They recorded the levels of live flu virus in the surrounding air and behind the mask.

They then correlated mask status (behind mask, not behind mask) with the level of live flu virus in the air.

The air behind the mask had, on average, a 6-fold reduction in the amount of live flu virus.

Internal / External Validity

Study 2

Researchers at the University of Michigan assigned different floors of undergraduate dorms to receive one of three treatments: control (no intervention), face masks (students given masks, a guide to use, and asked to wear in dorms), face masks and hand hygiene (students given masks and hand sanitizer, guide to use, asked to use in dorms).

Students reported flu symptoms and were tested to confirm an influenza infection.

The researchers then correlated the use of masks with flu infection rates. The mask-only group contracted influenza at a rate 1.1 times higher than the control group. (The mask and hand hygiene group contracted influenza at a rate 0.75 times lower than the control group.)

Internal / External Validity

Keep in mind the question: does wearing surgical masks during everyday activities reduce YOUR risk of contracting a respiratory infection from a cold or flu virus?

Which of these two studies has internal validity?

Which of these two studies has external validity?

Internal / External Validity

Keep in mind the question: does wearing surgical masks during everyday activities reduce YOUR risk of contracting a respiratory infection from a cold or flu virus?

Which of these two studies has internal validity?

Study 1 and Study 2

Which of these two studies has external validity?

Study 2

Internal / External Validity

Both studies have strong internal validity: it is believable that the correlations used in the studies correctly recover the true causal relationship between the X (mask wearing) and Y.

In study 1, it is hard to imagine what else other than the mask could reduce the flu virus near the mouth and nose of the dummies used.

In study 2, the use of a randomized experiment allows researchers to reduce or eliminate the possibility of bias. (More on this next lecture)

Internal / External Validity

Only study 2 has strong external validity. Study 1 shows that masks can reduce the amount of live flu virus that reaches the mouth and nose of the wearer, but it was conducted in a laboratory using a dummy. It reflected neither real-world exposure to the virus nor real-world usage of masks.

Study 2 more directly assesses whether you would benefit from wearing a mask around in your daily activities. The masks were provided to actual students living in dorms; thus it reports the effects of masks for people like you.

Internal / External Validity

This is not trivial: even though Study 1 is believable in finding wearing a mask reduces exposure to live viruses, these effects in laboratories do not translate to the real-world context in which you would use a mask. Study 2, by contrast, assesses the effects of mask wearing on contracting the flu in the context you would use them. And it finds no effect of wearing masks.
